We analyse in detail a new approach to the monitoring and forecasting of the onset of transitions in high dimensional complex systems (see Phys. Rev. Lett. 113, 264102 (2014)) by application to the Tangled Nature Model of evolutionary ecology and to high dimensional replicator systems with a stochastic element. A high dimensional stability matrix is derived for the mean field approximation to the stochastic dynamics. This allows us to determine the stability spectrum about the observed quasi-stable configurations. From the overlap of the instantaneous configuration vector of the full stochastic system with the eigenvectors of the unstable directions of the deterministic mean field approximation we are able to construct a good early-warning indicator of the transitions occurring intermittently. Inspired by these findings we suggest an alternative, simplified forecasting procedure which only makes use of observable data streams.
Introduction - High dimensional complex systems, both physical and biological, exhibit intermittent dynamical evolution consisting of stretches of relatively little change interrupted by often sudden and dramatic transitions to a new metastable configuration.

Such transitions can have crucial consequences when they occur in, say, ecosystems or financial markets, and it is therefore important to develop methods that are able to identify precursors and warning signals, and ideally techniques to forecast the transitions before they take place. We expect that the mechanisms behind the rapid rearrangement may be different in different systems. Scheffer and collaborators have developed a method pertinent to systems in which the transition takes the form of a bifurcation captured by a robust macroscopic variable, which emerges from the micro dynamics. A precursor of the systemic change can then be identified from the critical slowing down and enhanced fluctuations exhibited by this macroscopic collective degree of freedom as a change in some external parameter drives the system towards the bifurcation point.

Here we consider an alternative scenario suggested recently in [5], in which the transitions are induced by intrinsic fluctuations at the level of the individual components, which propagate to the macroscopic systemic level and thereby trigger a change in the overall configuration. Our approach is relevant to systems in which the available configuration space evolves as a consequence of the dynamics. One may think of a new and more virulent virus being created through a mutation of an existing strain (e.g. the SARS virus in 2003), or a new economic agent arriving in the market (e.g. the dot-com bubble in 1997-2000).

We describe below our methodology through applications to two models. First we consider the Tangled Nature (TaNa) Model of evolutionary ecology [6], which has had considerable success in reproducing both macro-evolutionary aspects such as the intermittent mode of extinctions [7] and ecological aspects such as species abundance distributions [8] and species area laws [9].

We also present results for transitions in a model with a very different type of dynamics, namely a high dimensional replicator with a stochastic element of mutation. We demonstrate below that the replicator system with this element of stochasticity exhibits intermittency. Given the broad relevance of the replicator dynamics (population dynamics, game theory, financial dynamics, social dynamics etc.), success in forecasting transitions in this model may indicate that our method can be useful in many very different situations.

Despite their different general mechanisms, the two models can be pictured in the same way. Their stochastic dynamics is characterized by a huge number of fixed points, and when the system randomly falls into one of them it enters a quiescent period of little change. Eventually the intrinsic stochastic fluctuations will allow the population of hitherto empty parts of configuration space, which may effectively serve as a random kick that is able to drive the system away from the local minimum and towards the chaotic regime, where the system undergoes a high dimensional adaptive walk searching for another (metastable) fixed point.

Indeed both the nature of the fixed points and their stability vary significantly. Some fixed points are controlled by only a few interacting components while others involve many. Some are very stable while others are less so, leading to a very broad distribution of the time spent in the metastable configurations of a given fixed point. The dynamics of the transitions between metastable configurations - the adaptive walk mentioned above - can also differ considerably. It can happen that the system is "trapped" between two or more attractors and switches between them before being pushed away. The transitions that lead from one fixed point to another can be either sudden or slow and differ in magnitude. The point to be stressed is that the phenomenon we are trying to predict is highly heterogeneous, and one has to bear this well in mind when interpreting the results.

That said, our claim is that we are able, in both models, to understand which kind of intrinsic stochastic fluctuation will be able to push the system out of its stable configuration. Indeed, through a mean field description of the stochastic dynamics we can infer the Jacobian, from which by Linear Stability Analysis (LSA) we can identify the unstable eigendirections responsible for the destruction of the current metastable configuration.

As will be shown in the following sections, monitoring the relationship (vectorial overlap) between the existing configuration and the unstable mean field eigendirections (the dangerous directions) allows us to forecast approaching transitions with high accuracy.


OUTLINE OF FORECASTING PROCEDURE 

In this section we first sketch our approach; in the following two sections we then describe in detail how to apply the method to the Tangled Nature Model and to the replicator system. The first step is to establish a mean field approximation of the stochastic dynamics in order to obtain a set of deterministic equations. We establish the average flow of occupancy between different types of individual agents. Defining the state vector $\mathbf{n}(t) = (n_1(t), \ldots, n_d(t))$, the mean field time evolution is of the form

$$\mathbf{n}(t+1) - \mathbf{n}(t) = T(\mathbf{n}(t))\,\mathbf{n}(t) \qquad (1)$$

where the matrix $T$ is the mean field evolution matrix, which contains contributions from the following processes: death, reproduction and mutation, and $\mathbf{n}(t)$ is a local time average of the stochastic configuration.

We can check the accuracy of our mean field description of the stochastic system by measuring the norm of the left hand side of Eq. (1), that is $\|\Delta\mathbf{n}(t)\|$, during the simulations and comparing it with the norm of the right hand side, i.e. $\|T(\mathbf{n}(t))\,\mathbf{n}(t)\|$. If the difference

$$D(\mathrm{sim}, \mathrm{meanfield}) = \|\Delta\mathbf{n}(t)\|_{\mathrm{sim}} - \|T(\mathbf{n}(t))\,\mathbf{n}(t)\| \qquad (2)$$

is close to zero, the mean field approximation represents the stochastic dynamics well, at least in a local time and configuration neighbourhood of $\mathbf{n}(t)$. This suggests that we can use Eq. (1) to study local stability properties. In Fig. 1 we can see how these two quantities relate in the two models. In the Replicator Model (left panel) the mean field evolution (black curve) appears to be the average of the stochastic evolution (red curve), while in the Tangled Nature Model (right panel) they differ more clearly. This result reflects the different types of dynamics of the models: the Tangled Nature Model is completely stochastic, while the Replicator Model is closer to a Langevin dynamics.
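
As an illustration of the consistency check in Eq. (2), the following minimal sketch compares the observed one-step change of a configuration with the mean field prediction. The function names and the two-type toy matrix are assumptions made for the example; they are not part of either model studied here.

```python
import math

def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))

def mean_field_step(T, n):
    """Right hand side of Eq. (1): the matrix-vector product T(n) n."""
    matrix = T(n)
    return [sum(row[j] * n[j] for j in range(len(n))) for row in matrix]

def mean_field_mismatch(n_now, n_next, T):
    """D of Eq. (2): ||n(t+1) - n(t)|| from the simulation minus ||T(n) n||."""
    delta_sim = [b - a for a, b in zip(n_now, n_next)]
    return norm(delta_sim) - norm(mean_field_step(T, n_now))

# Toy check (hypothetical two-type system): type 0 leaks 10% of its
# occupancy to type 1 at every step.
def toy_T(n):
    return [[-0.1, 0.0], [0.1, 0.0]]

n0 = [100.0, 50.0]
n1 = [90.0, 60.0]  # exactly what the toy mean field predicts
mismatch = mean_field_mismatch(n0, n1, toy_T)
```

In a real simulation `n_next` would come from the stochastic update, so `mismatch` would fluctuate around zero only where the mean field description is adequate.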

In the mean field approximation the fixed point configurations $\mathbf{n}^*$ are given as solutions to $T(\mathbf{n}^*)\,\mathbf{n}^* = 0$, see Eq. (1). Because of the high dimensionality of the type of systems we have in mind, this equation will typically not be solvable analytically. In any case, the stochastic dynamics will not satisfy the fixed point condition strictly. Rather we expect little time variation during a metastable phase, i.e. $\mathbf{n}(t+1) \simeq \mathbf{n}(t) \simeq \mathbf{n}^*$, where $\mathbf{n}^*$ is a local time average of $\mathbf{n}(t)$. This means that the left hand side of Eq. (1) will be close to zero and that $\mathbf{n}^*$ is essentially a fixed point of the mean field dynamics. We perform a linear stability analysis about $\mathbf{n}^*$ by expanding the right hand side of Eq. (1). We introduce $\mathbf{n}(t) = \mathbf{n}^* + \delta\mathbf{n}(t)$, expand to first order in $\delta\mathbf{n}(t)$ and get from Eq. (1)

$$\delta\mathbf{n}(t+1) - \delta\mathbf{n}(t) \simeq \left(T(\mathbf{n}^*) + \partial_{\mathbf{n}} T(\mathbf{n}^*)\,\mathbf{n}^*\right)\delta\mathbf{n}(t) = M(\mathbf{n}^*)\,\delta\mathbf{n}(t) \qquad (3)$$

Here the matrix

$$M(\mathbf{n}^*) = T(\mathbf{n}^*) + \partial_{\mathbf{n}} T(\mathbf{n}^*)\,\mathbf{n}^* \qquad (4)$$

is the Jacobian of the system, or the stability matrix. Exploiting the results of the LSA, we know that the eigenvectors, or generalised eigenvectors (in the case of a non-diagonalizable Jacobian), $\mathbf{e}_+$ associated with eigenvalues $\lambda$ with $\mathrm{Re}(\lambda) > 0$ indicate unstable directions. These can be identified with toxic components of the configuration vector.

What this means is that if the stochastic fluctuations bring the system towards these unstable directions, by activating the toxic components, the system feels a repulsive force that pushes it away from $\mathbf{n}^*$. A sudden growth of these components would indicate the arrival of a transition. This observation allows us to identify a stability indicator, whose non-zero values are early-warning signals of an approaching transition caused by the system leaving the vicinity of the current fixed point. The details of this indicator will depend on the specific case we are dealing with but will be based on the same general idea.
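
The linear stability step can be sketched numerically as follows. This is only an illustration under stated assumptions: the Jacobian is approximated by central finite differences of a user-supplied flow $F(\mathbf{n}) = T(\mathbf{n})\,\mathbf{n}$, and the most unstable direction is found by power iteration, which presupposes that the eigenvalue with the largest real part is real and simple. The toy flow and all function names are hypothetical.

```python
import math

def jacobian(flow, n_star, eps=1e-6):
    """Central finite-difference Jacobian M_ij = d flow_i / d n_j at n_star."""
    d = len(n_star)
    M = [[0.0] * d for _ in range(d)]
    for j in range(d):
        up = list(n_star)
        up[j] += eps
        down = list(n_star)
        down[j] -= eps
        f_up, f_down = flow(up), flow(down)
        for i in range(d):
            M[i][j] = (f_up[i] - f_down[i]) / (2.0 * eps)
    return M

def leading_direction(M, shift=10.0, iterations=500):
    """Power iteration on M + shift*I.

    Assumes the eigenvalue of M with the largest real part is real and simple,
    that shift makes it dominant in magnitude, and that the starting vector is
    not orthogonal to the corresponding eigenvector.
    Returns (eigenvalue estimate, unit eigenvector).
    """
    d = len(M)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(M[i][j] * v[j] for j in range(d)) + shift * v[i] for i in range(d)]
        length = math.sqrt(sum(x * x for x in w))
        v = [x / length for x in w]
    Mv = [sum(M[i][j] * v[j] for j in range(d)) for i in range(d)]
    eigenvalue = sum(v[i] * Mv[i] for i in range(d))  # Rayleigh quotient
    return eigenvalue, v

# Toy flow (hypothetical): n_0 grows logistically, n_1 decays;
# the origin is a fixed point with one unstable direction.
def toy_flow(n):
    return [n[0] * (1.0 - n[0]), -n[1]]

M = jacobian(toy_flow, [0.0, 0.0])
lam, e_plus = leading_direction(M)
is_unstable = lam > 0.0
```

For the high dimensional, generally non-symmetric matrices discussed in this paper a full eigen-decomposition routine from a numerical library would be the natural choice; the power iteration is used here only to keep the sketch self-contained.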
Figure 1. Comparison of the stochastic model (red curve) with its mean field approximation (black curve). At each time step the stochastic configuration is used as input in the mean field equation and the norms of the configuration vector are then compared. The left panel is for the Replicator Model while the Tangled Nature Model is in the right panel.


In the following sections we present the two models, analyzing their basic mechanisms, and develop our mean field stability indicator in both cases.


THE TANGLED NATURE MODEL 


The model - In the TaNa, an agent is represented by a sequence of binary variables with fixed length $L$ [14], denoted as $\mathbf{S}^a = (S^a_1, \ldots, S^a_L)$, where $S^a_i = \pm 1$. Thus there are $2^L$ different sequences, each one represented by a vector in the genotype space $\mathcal{S} = \{-1, 1\}^L$. In a simplistic picture, each of these sequences represents a genome uniquely determining the phenotype of all individuals of this genotype. We denote by $n(\mathbf{S}^a, t)$ the number of individuals of type $\mathbf{S}^a$ at time $t$, and the total population is $N(t) = \sum_{a=1}^{2^L} n(\mathbf{S}^a, t)$. We define the distance between different genomes $\mathbf{S}^a$ and $\mathbf{S}^b$ as the Hamming distance $d_{ab} = \frac{1}{2} \sum_{i=1}^{L} |S^a_i - S^b_i|$.

A time step is defined as a succession of one annihilation attempt and one reproduction attempt. During the killing attempt, an individual is chosen randomly from the population and killed with a probability $p_{kill}$, constant in time and independent of the type. During the reproduction process, a different randomly chosen individual $\mathbf{S}^a$ successfully reproduces with probability $p_{off}(\mathbf{S}^a, t) = 1/(1 + \exp[-H(\mathbf{S}^a, t)])$, which depends on the occupancy distribution of all the types at time $t$ via the weight function

$$H(\mathbf{S}^a, t) = \frac{k}{N(t)} \sum_{b} J(\mathbf{S}^a, \mathbf{S}^b)\, n(\mathbf{S}^b, t) - \mu N(t). \qquad (5)$$

In Eq. (5) the first term couples the agent $\mathbf{S}^a$ to one of type $\mathbf{S}^b$ by introducing the interaction strength $J(\mathbf{S}^a, \mathbf{S}^b)$, whose values are randomly distributed in the interval $(-1, +1)$. For simplicity and to emphasize interactions we here assume $J(\mathbf{S}^a, \mathbf{S}^a) = 0$. The parameter $k$ scales the interaction strength and $\mu$ can be thought of as the carrying capacity of the environment. An increase (decrease) in $\mu$ corresponds to harsher (more favorable) external conditions. The reproduction is asexual: the reproducing agent is removed from the population and substituted by two copies, which are subject to mutations. A single mutation changes the sign of one of the genes: $S_i \to -S_i$ with probability $p_{mut}$.

Similarly to a Monte Carlo sweep in statistical mechanics, the unit of time of our simulations is a generation, consisting of $N(t)/p_{kill}$ time steps, i.e. the average time needed to kill all the individuals present at time $t$. These microscopic rules generate intermittent macro dynamics. The system persistently switches between two different modes: the metastable states (denoted quasi-Evolutionary Stable Strategies or qESS) and the transitions separating them. The qESS states are characterized by small amplitude fluctuations of $N(t)$ and stable patterns of occupancies of the types (Fig. 2, left and right panel respectively). However, these states are not perfectly stable and configurational fluctuations may trigger an abrupt transition to a different qESS state. The transitions consist of collective adaptive random walks in the configuration space while searching for a new metastable configuration, and are related to high amplitude fluctuations of $N(t)$. All the results we will present for this model have been obtained fixing the parameters to $L = 8$, $p_{mut} = 0.2$, $p_{kill} = 0.4$, $k = 40$ and $\mu = 0.07$.
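
The microscopic rules just described can be sketched as a single time step. This is a minimal illustration, not the simulation code used for the results: genotypes are stored as tuples of $\pm 1$, the population as a dictionary of occupation numbers, and the helper `make_random_J`, which draws the couplings lazily, is a hypothetical convenience.

```python
import math
import random

def weight(a, counts, J, k, mu):
    """Weight function H(S^a, t) of Eq. (5); J(a, a) is taken to be zero."""
    N = sum(counts.values())
    coupling = sum(J(a, b) * n_b for b, n_b in counts.items() if b != a)
    return k * coupling / N - mu * N

def p_off(a, counts, J, k, mu):
    """Reproduction probability 1 / (1 + exp(-H)), written to avoid overflow."""
    h = weight(a, counts, J, k, mu)
    if h >= 0.0:
        return 1.0 / (1.0 + math.exp(-h))
    e = math.exp(h)
    return e / (1.0 + e)

def pick_individual(counts, rng):
    """Genotype of an individual chosen uniformly from the population."""
    types = list(counts)
    return rng.choices(types, weights=[counts[s] for s in types])[0]

def mutate(genome, p_mut, rng):
    """Flip each gene (+1 <-> -1) independently with probability p_mut."""
    return tuple(-g if rng.random() < p_mut else g for g in genome)

def remove_one(counts, genome):
    counts[genome] -= 1
    if counts[genome] == 0:
        del counts[genome]

def time_step(counts, J, k, mu, p_kill, p_mut, rng):
    """One killing attempt followed by one reproduction attempt."""
    if not counts:
        return
    victim = pick_individual(counts, rng)
    if rng.random() < p_kill:
        remove_one(counts, victim)
    if not counts:
        return
    parent = pick_individual(counts, rng)
    if rng.random() < p_off(parent, counts, J, k, mu):
        remove_one(counts, parent)  # the parent is replaced by two offspring
        for _ in range(2):
            child = mutate(parent, p_mut, rng)
            counts[child] = counts.get(child, 0) + 1

def make_random_J(seed):
    """Hypothetical helper: draw J(a, b) uniformly in (-1, 1) on first use."""
    rng = random.Random(seed)
    cache = {}
    def J(a, b):
        if a == b:
            return 0.0
        if (a, b) not in cache:
            cache[(a, b)] = rng.uniform(-1.0, 1.0)
        return cache[(a, b)]
    return J

rng = random.Random(1)
J = make_random_J(0)
counts = {(1,) * 8: 50}  # start from a single genotype with L = 8
for _ in range(200):
    time_step(counts, J, k=40.0, mu=0.07, p_kill=0.4, p_mut=0.2, rng=rng)
```

The parameter values in the usage lines follow those quoted in the text; the initial condition and the number of steps are arbitrary choices for the example.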
Figure 2. Left panel: total population as a function of time (in generations) for a single realization of the TaNa. The punctuated dynamics is clearly visible: quasi-stable periods alternate with periods of hectic transitions, during which $N(t)$ exhibits large amplitude fluctuations. Right panel: occupancy distribution of the types. The genotypes are labelled arbitrarily and a dot indicates a type which is occupied at the time $t$. These figures are obtained with parameters $L = 8$, $p_{mut} = 0.2$, $p_{kill} = 0.4$, $k = 40$ and $\mu = 0.07$.


Mean Field Description - Tangled Nature Model 

In the TaNa model there are multiple sources of stochasticity, namely reproduction, mutations and deaths. Following the procedure outlined above we average out these sources and formulate a deterministic mean field equation. At each time step, with probability $p_{kill}$ a randomly chosen individual is removed from the system, which implies that the occupation number of the species it belongs to decreases by one unit ($\Delta n_i = -1$). Given that the probability of choosing an individual belonging to the $i$th species is $\rho_i = n_i/N$, the killing term becomes

$$\rho_i\, p_{kill}\, (-1). \qquad (6)$$

The reproduction term is slightly more complicated given the presence of mutations. A randomly chosen individual is selected for asexual reproduction, which means it is removed from the system while creating two new individuals of the same species. Both offspring can mutate ($\Delta n_i = -1$), only one can mutate ($\Delta n_i = 0$), or none can mutate ($\Delta n_i = +1$). Keeping in mind that the probability of reproducing is given by $p^{off}_i(t)$, the average contribution from reproduction with mutation is

$$\rho_i\, p^{off}_i(t)\, [2 p_0 - 1] = \alpha\, \rho_i\, p^{off}_i(t), \qquad (7)$$

where $p_0 = (1 - p_{mut})^L$ is the probability of no mutations and $\alpha = (2 p_0 - 1)$ is a constant. The third term we have to consider is the back-flow effect, which describes the event of being populated by mutations occurring during reproduction happening elsewhere. This term has the form

$$2 \sum_{j \neq i} \rho_j(t)\, p^{off}_j(t)\, p^{mut}_{ij}. \qquad (8)$$

For a type $j$ at Hamming distance $d_{ij}$ to be able to mutate into a given type $i$, $d_{ij}$ genes will have to mutate, and this will happen with the binomial probability

$$p^{mut}_{ij} = p_{mut}^{d_{ij}}\, (1 - p_{mut})^{L - d_{ij}}. \qquad (9)$$

Putting together all these effects we find the form of Eq. (1) for this model, namely

$$n_i(t+1) - n_i(t) = \frac{1}{N} \sum_j \left\{ \left( p^{off}_j(t)\, (2 p_0 - 1) - p_{kill} \right) \delta_{ij} + 2\, p^{off}_j(t)\, p^{mut}_{ij}\, (1 - \delta_{ij}) \right\} n_j(t) = \frac{1}{N} \sum_j T_{ij}\, n_j(t), \qquad (10)$$

where

$$T_{ij} = \left( p^{off}_j(t)\, (2 p_0 - 1) - p_{kill} \right) \delta_{ij} + 2\, p^{off}_j(t)\, p^{mut}_{ij}\, (1 - \delta_{ij}) \qquad (11)$$

is the mean field evolution matrix of the system.

By substituting Eq. (11) into Eq. (4) we get the specific form of the stability matrix for the Tangled Nature Model

$$M_{ij} = (\alpha\, p^{off}_i - p_{kill})\, \delta_{ij} + 2 (1 - \delta_{ij})\, p^{off}_j\, p^{mut}_{ij} + \sum_k \left[ \alpha\, \delta_{ik} + 2 (1 - \delta_{ik})\, p^{mut}_{ik} \right] \left. \frac{\partial p^{off}_k}{\partial n_j} \right|_{\mathbf{n}^*} n^*_k. \qquad (12)$$

This is the mean field matrix we use for our linear stability analysis of the stochastic fixed points.

THE REPLICATOR MODEL WITH STOCHASTICITY


The deterministic version of the replicator dynamics is used routinely in a large variety of applications, not least because of its relation to game theory, and is therefore expected to be of relevance to the description of e.g. high dimensional socio-economic or biological systems. This suggests that if our method works for the stochastic replicator, the procedure can be of broad relevance as a way to identify and analyse precursors of endogenous transitions.

We are interested in the limit of many strategies played by agents that may leave the system (say go bankrupt or extinct) or may change their strategy, or mutate. This version of the replicator dynamics set-up was studied by Tokita and Yasutomi, who focused on the emerging network properties. Here we continue this study but with an emphasis on the intermittent nature of the macro-dynamics.

For this model the configuration vector $\mathbf{n}$ contains the relative frequencies of all the $d$ allowed strategies, so the components satisfy $n_i(t) \in [0, 1]$ for all $i = 1, 2, \ldots, d$, but not all strategies need be occupied at a given moment, i.e. we can have $n_i(t) = 0$ for some strategy $i$. We start the simulations by generating the $d \times d$ payoff matrix $J$ of the game, which tells us the payoffs of every pairwise combination. As for the Tangled Nature Model above, the matrix $J$ is a random and constant interaction network on which the replicator dynamics is embedded. Each strategy distinguishes itself from the others in its payoffs or interactions with the rest of the strategy space.

Here we used the same type of uncorrelated interaction matrix as used in the study above of the Tangled Nature Model. The dimension of the matrix is large, namely $d \in (10^2, 10^4)$. The qualitative aspects of the behavior remain the same for other types of payoff matrices. We found that matrices with payoffs uniformly distributed on the interval $(-1, 1)$ or on the set $\{0, 1\}$ exhibit the same behavior as matrices of the form used for the Tangled Nature Model. However, if the payoffs are drawn from a power law distribution with no second moment, the dynamics becomes different and the intermittent behavior is no longer so clear.

In the initial configuration, $N_0 < d$ strategies start with the same frequency $n_i = 1/N_0$. All the other possible strategies are non-active, i.e. the corresponding $d - N_0$ components of $\mathbf{n}(0)$ are $n_i(0) = 0$, since no fraction of the players use them. The empty strategies can only become populated by one of the active strategies mutating into them. Once this happens their frequency will evolve according to the replicator equation, in which these newly occupied strategies interact with the active strategies to which they are linked through the matrix $J$.

A time step of the replicator dynamics consists in calculating the fitness $h_i(t) = \sum_j J_{ij}\, n_j(t)$ of each active strategy and comparing it with the average fitness $\bar{h}(t) = \sum_{kj} J_{kj}\, n_k(t)\, n_j(t)$, exactly as expected in a replicator dynamics. Each frequency is then updated according to

$$n_i(t+1) = n_i(t) + \left( \sum_j J_{ij}\, n_j(t) - \sum_{kj} J_{kj}\, n_k(t)\, n_j(t) \right) n_i(t). \qquad (13)$$
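
The deterministic update of Eq. (13) can be written in a few lines. The sketch below uses plain lists for the frequency vector and the payoff matrix; the two-strategy example is a hypothetical toy, chosen only because the result is easy to verify by hand.

```python
def replicator_step(n, J):
    """One deterministic update of Eq. (13) for frequencies n and payoffs J."""
    d = len(n)
    fitness = [sum(J[i][j] * n[j] for j in range(d)) for i in range(d)]
    mean_fitness = sum(n[i] * fitness[i] for i in range(d))
    return [n[i] + (fitness[i] - mean_fitness) * n[i] for i in range(d)]

# Toy example: strategy 0 gains from meeting strategy 1, which loses in turn.
n = [0.5, 0.5]
J = [[0.0, 1.0], [-1.0, 0.0]]
n_next = replicator_step(n, J)  # fitnesses 0.5 and -0.5, mean fitness 0
```

Since the mean fitness is subtracted from every individual fitness, the update leaves the sum of the frequencies unchanged as long as they add up to one.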


The stochastic element of the otherwise deterministic dynamics consists in the following updates. With probability $p_{mut}$ each strategy mutates into another one; this is done by transferring a fraction $\alpha_{mut}$ of the frequency from the considered strategy to another strategy. The label of the latter strategy is chosen in the vicinity of the first by use of a normal distribution $\mathcal{N}(i, A)$ centered on label $i$ with variance $A$. The closer the labels of two strategies, the more likely it is for one to mutate into the other.

It should be noted that as long as the payoff matrix is random and uncorrelated in its indices, no real similarity criterion between strategies exists (two strategies with similar labels interact in a completely different way with the environment). The variance parameter has been introduced only to control the level of disorder of the system.

When the frequency of a strategy $i$ goes below a preset extinction threshold, $n_i(t) < n_{ext}$, the strategy is considered extinct and its frequency is set to zero, $n_i(t+1) = 0$. Right after an extinction event the system is immediately renormalized in order to maintain the condition $\sum_i n_i(t) = 1$.
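
The stochastic mutation and the extinction rule can be sketched as follows. Two points are assumptions of this illustration rather than statements of the text: strategy labels are taken to wrap around periodically at the ends of the label range, and the renormalisation is applied at every call rather than only after an extinction (which changes nothing when no strategy has been removed).

```python
import random

def mutate(n, p_mut, alpha_mut, variance, rng):
    """Each active strategy, with probability p_mut, transfers a fraction
    alpha_mut of its frequency to a strategy with a nearby label."""
    d = len(n)
    out = list(n)
    for i in range(d):
        if n[i] > 0.0 and rng.random() < p_mut:
            j = i
            while j == i:
                # Assumption: labels wrap around periodically. The loop needs
                # d > 1 and a variance large enough to reach another label.
                j = round(rng.gauss(i, variance ** 0.5)) % d
            moved = alpha_mut * n[i]
            out[i] -= moved
            out[j] += moved
    return out

def apply_extinctions(n, n_ext):
    """Set frequencies below n_ext to zero and renormalise so that sum(n) = 1.
    Assumes that at least one strategy survives."""
    survivors = [x if x >= n_ext else 0.0 for x in n]
    total = sum(survivors)
    return [x / total for x in survivors]

rng = random.Random(0)
n = [0.25, 0.25, 0.25, 0.25, 0.0, 0.0, 0.0, 0.0]
n = mutate(n, p_mut=0.2, alpha_mut=0.01, variance=4.0, rng=rng)
n = apply_extinctions(n, n_ext=0.001)
```

The values of `p_mut`, `alpha_mut` and `n_ext` follow the parameter set quoted in the text, while the dimension and the variance are arbitrary choices for the example.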

The systemic level dynamics is complex, as seen from the time evolution of the occupancy vector $\mathbf{n}(t)$, see Fig. 3.

All the results for this model have been obtained with the same parameter set, namely: $d = 256$, $n_{ext} = 0.001$, $\alpha_{mut} = 0.01$, $p_{mut} = 0.2$.


Mean Field Description - Replicator Model 

The random mutations are the only source of stochasticity in the model's dynamics. To account for these stochastic events one has to take into account the possibility that a strategy loses part of its frequency by mutating into other strategies and gains frequency as a result of mutations happening elsewhere. As a result a given strategy $i$ may lose a fraction $\alpha_{mut}$ of its players, which happens with probability $p_{mut}$, or gain $\alpha_{mut}\, n_j(t)$ from another strategy $j$, which happens with probability $p_{mut}/N_a$, where $N_a$ is the number of active strategies. This second effect describes the probability of being populated by a mutation. We therefore get the mean field description as

$$n_i(t+1) \simeq n_i(t) + \left( \sum_j J_{ij}\, n_j(t) - \sum_{kj} J_{kj}\, n_k(t)\, n_j(t) \right) n_i(t) - \alpha_{mut}\, p_{mut}\, n_i(t) + \frac{p_{mut}}{N_a(t)} \sum_{j \neq i} \alpha_{mut}\, n_j(t), \qquad (14)$$

which can be expressed in compact form as

$$n_i(t+1) - n_i(t) \simeq \sum_j T_{ij}\, n_j(t), \qquad (15)$$

where

$$T_{ij} = \left( \sum_k J_{ik}\, n_k(t) - \sum_{kl} J_{kl}\, n_k(t)\, n_l(t) - \alpha_{mut}\, p_{mut} \right) \delta_{ij} + \frac{\alpha_{mut}\, p_{mut}}{N_a(t)}\, (1 - \delta_{ij}). \qquad (16)$$

The stability matrix is obtained by substitution in Eq. (4):

$$M_{ij} = T_{ij}(\mathbf{n}^*) + \left( J_{ij} - \sum_k (J_{jk} + J_{kj})\, n^*_k \right) n^*_i. \qquad (17)$$

Figure 3. Left panel: occupancy distribution of the types. The genotypes are labelled arbitrarily and a dot indicates a type which is occupied at the time $t$. The punctuated dynamics is clearly visible: quasi-stable periods alternate with periods of hectic transitions. Right panel: the frequencies of the strategies. Each color belongs to a different strategy. Once again the transitions from one fixed point to another are clear.


FORECASTING PROCEDURE AND RESULTS 

We described in the previous sections how the dynamics in the two models consists of intermittent swift transitions between metastable configurations. As mentioned above, we approximate the fixed points of the mean field dynamics by local time averages over successive configurations in the quasi-stable phases of the full stochastic dynamics, namely $\mathbf{n}^* = \frac{1}{T} \sum_{t=0}^{T} \mathbf{n}(t)$, which we will treat as our fixed point.

Through our procedure we want to study the stability in the neighborhood of $\mathbf{n}^*$, in order to predict the system's reaction to the stochastic perturbations. To the extent that the mean field matrix correctly describes the system, the metastable states will become unstable along directions in configuration space given by the eigenvectors $\mathbf{e}_+$ corresponding to eigenvalues with a positive real part, $\mathrm{Re}(\lambda) > 0$.

Once we know the form of the eigenspace we can monitor two important scalar quantities: the instantaneous distance from the fixed point

$$\delta n(t) = \|\delta\mathbf{n}(t)\| = \|\mathbf{n}(t) - \mathbf{n}^*\| \qquad (18)$$

and the maximum overlap between the perturbation and the eigenvectors $\{\mathbf{e}_+\}$ of the unstable subspace

$$Q(t) = \max_i \left| \delta\mathbf{n}(t) \cdot \mathbf{e}^i_+ \right|. \qquad (19)$$

The quantity in Eq. (18) tells us how far away the system is from the fixed point. If no unstable directions exist the system is expected to stay in the vicinity of the fixed point and hence we expect $\delta n(t)$ to fluctuate around a low constant value, while a transition would induce a sudden increase in $\delta n(t)$. The overlap in Eq. (19) tells us to what extent a deviation $\mathbf{n}(t) - \mathbf{n}^*$ lies within the unstable subspace.
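
Both monitored quantities are simple to compute once the unstable eigenvectors are known. The following sketch assumes real eigenvectors of unit norm, supplied as plain lists; complex or generalised eigenvectors would need a different treatment. The function name is a hypothetical choice.

```python
import math

def distance_and_overlap(n, n_star, unstable_vectors):
    """Return (delta_n, Q) as defined in Eqs. (18) and (19).

    unstable_vectors is a list of real, unit-norm eigenvectors e_+.
    If the list is empty, Q is reported as 0.
    """
    delta = [a - b for a, b in zip(n, n_star)]
    delta_n = math.sqrt(sum(x * x for x in delta))
    overlaps = (abs(sum(x * e for x, e in zip(delta, vec))) for vec in unstable_vectors)
    Q = max(overlaps, default=0.0)
    return delta_n, Q

# Toy example: the deviation has a component of size 4 along the only
# unstable direction, which is the second coordinate axis.
delta_n, Q = distance_and_overlap([3.0, 4.0, 0.0], [0.0, 0.0, 0.0], [[0.0, 1.0, 0.0]])
```

Evaluating this function at every time step of a simulation yields the two time series that are compared with their respective thresholds in the forecasting procedure.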

Another way of picturing $Q(t)$ is as a measure of the activity of the occupancy on dangerous nodes. Indeed every non-zero component of the unstable eigenvectors $\{\mathbf{e}_+\}$ tells us which nodes of the interaction network are capable of pushing the system out of its metastable configuration. Namely, if $e^+_j > 0$, where $j$ indicates the component of the unstable eigenvector, this means that the $j$th node is dangerous.

$Q(t)$ monitors the activity of such nodes. If one of these nodes were to become activated by mutations (which are the stochastic perturbations), this would result in a rapid growth of $Q(t)$ and can be considered as a warning of a subsequent transition.

Figure 4. In the bottom panel of this figure we show the behavior of $\delta n(t)$ (blue curve) and $Q(t)$ (red curve) while approaching the transition in the Tangled Nature Model. In the top panel a weighted occupation plot is presented. We can see how the beginning of the transition (dashed vertical black line) is triggered by a new mutant (black arrow) that quickly gains population. The arrival of the new dangerous mutant is signalled by a peak in $Q(t)$.

Figure 5. The same type of figure as Fig. 4, for the Replicator Model. Bottom panel: $\delta n(t)$ and $Q(t)$, blue and red curve respectively; top panel: weighted occupation plot. We can see how in this model too the transition is triggered by the arrival of a new fit mutant that, by gaining weight, disturbs the existing equilibrium.

In Fig. 3 of [5] it was discussed how these two quantities behave in the TaNa model; there we demonstrated the forecasting power of the indicator $Q(t)$ and gave an explanation of why some of the transitions were missed.

Here we illustrate in Fig. 4 and Fig. 5 the temporal behaviour of $Q(t)$ and $\delta n(t)$ for both the Tangled Nature Model and the stochastic replicator system. The top panels contain weighted occupation plots while the bottom panels show the behavior of the two quantities $Q(t)$ and $\delta n(t)$. The arrow points at the new dangerous mutant that has entered the system, while the dashed line indicates the moment this happens. Before the dashed line we can see that the fluctuations in $\delta n(t)$ are bounded and $Q(t)$ is essentially equal to zero. After the dashed line, when the new mutant has entered the system, we see an explosion of both quantities.

We denote by $t^*$ the time at which the transition begins, which is set by $\delta n(t)$ crossing a reasonably chosen threshold $T_\delta$ and staying consistently above this threshold (we have used $T_\delta = 150$ for the TaNa and $T_\delta = 0.05$ for the Replicator Model). Given the sharp increase of $\delta n(t)$ when approaching the transition, $t^*$ does not depend strongly on the precise choice of the threshold as long as it is chosen larger than the characteristic fluctuations of $\delta n(t)$ during the metastable configurations.
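
One possible way to turn this definition of $t^*$ into a procedure is sketched below. The text only requires that $\delta n(t)$ stays "consistently" above the threshold; the parameter `hold`, the number of consecutive samples required, is a hypothetical way of making that requirement concrete.

```python
def transition_time(delta_n, threshold, hold=5):
    """First index t at which delta_n[t] exceeds the threshold and the series
    then stays above it for `hold` consecutive samples; None otherwise."""
    run_start = None
    for t, value in enumerate(delta_n):
        if value > threshold:
            if run_start is None:
                run_start = t
            if t - run_start + 1 >= hold:
                return run_start
        else:
            run_start = None  # an isolated spike does not count as a transition
    return None

# Toy series: a single spike at index 2, then a sustained rise from index 5.
series = [1, 2, 200, 3, 4, 160, 170, 180, 190, 400]
t_star = transition_time(series, threshold=150, hold=3)
```

With a sharp rise of $\delta n(t)$ at the transition, the returned index should be insensitive to moderate changes of both `threshold` and `hold`.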

To define an alarm we determine an appropriate threshold $A_Q$ on $Q(t)$. To do so we compare the number of false alarms with the number of missed transitions generated by different values of the chosen threshold $A_Q$. We define a false alarm as an event in which $Q(t)$ crosses $A_Q$ but then goes back below it before any transition occurs. On the other hand, a missed transition corresponds to a situation where $Q(t)$ remained below $A_Q$ even though the given metastable configuration did become unstable and therefore a transition did occur.
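
The bookkeeping of false alarms and missed transitions for a single metastable episode could look as follows. This is one possible reading of the definitions above, not necessarily the counting used for the figures: an alarm that is withdrawn before the transition counts as a false alarm, and the transition counts as missed when no alarm is active at $t^*$. The function name and return format are hypothetical.

```python
def classify_episode(Q, threshold, t_star):
    """Classify one metastable episode ending with a transition at index t_star.

    Returns (false_alarms, missed, t_cross):
      false_alarms - excursions of Q above the threshold that fell back below it
      missed       - True if no alarm was active when the transition began
      t_cross      - index at which the final, still active alarm was raised
    """
    false_alarms = 0
    t_cross = None
    for t in range(t_star + 1):
        above = Q[t] > threshold
        if above and t_cross is None:
            t_cross = t            # alarm raised
        elif not above and t_cross is not None:
            false_alarms += 1      # alarm withdrawn before any transition
            t_cross = None
    missed = t_cross is None
    return false_alarms, missed, t_cross

# Toy series: one withdrawn alarm at index 2, then a lasting alarm from index 5.
result = classify_episode([0, 0, 5, 0, 0, 3, 6, 9], threshold=2, t_star=7)
```

Repeating this classification over many episodes and many values of the threshold gives the fractions of false alarms and missed transitions as functions of $A_Q$.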

In Fig. 6 we show these two quantities for different $A_Q$. The red curve is the fraction of missed transitions while the blue curve is the fraction of transitions that have produced false alarms. In the Tangled Nature Model, when increasing $A_Q$ the fraction of false alarms decreases, as expected, while the fraction of missed transitions increases. The same figure for the Replicator Model shows how the procedure, although missing an increasing number of transitions, produces no false alarms at all.

The reason for this, we believe, has to do with the Langevin nature of the dynamics in the Replicator Model, i.e. deterministic dynamics plus stochastic noise. Within this approach we expand the configuration vector $\mathbf{n}(t)$ in the eigenspace, or generalized eigenspace, of $M$ plus noise. One gets

$$\mathbf{n}(t) = \sum_k \left( c_k(0)\, \exp(\lambda_k t)\, \mathbf{e}_k + \boldsymbol{\epsilon}_k \right) \qquad (20)$$

where $c_k(0)$ are the coefficients of the expansion and $\boldsymbol{\epsilon}_k$ is the noise. This dynamics is clearly dominated by those components for which $\mathrm{Re}(\lambda_k) > 0$, but this is true only if $c_k(0) \neq 0$. When a node is populated by a mutation, in our framework this corresponds to setting $c_k(t) > 0$. From then on the term is suppressed if and only if $\boldsymbol{\epsilon}_k$ points in the opposite direction at all times, which is highly unlikely. The same picture is less applicable to the Tangled Nature Model, where all updates are stochastic and hence the separation into a robust deterministic part perturbed by a weak stochastic part is problematic.
Figure 6. The fraction of false alarms and missed transitions for different values of $A_Q$ in the Replicator Model (left panel) and the Tangled Nature Model (right panel). One can see that the procedure produced no false alarms in the Replicator Model, which is consistent with what we expected given the Langevin nature of the model.


The way to interpret the missed transitions is to think of the fixed points as local minima of a heterogeneous high dimensional energy landscape. The eigenspace of the mean field matrix tells us where the downhill slopes and uphill barriers are. Although it is far more likely for the system to leave the fixed point through a downhill slope, a stochastic perturbation may be able to push the system over a barrier. This interpretation is confirmed by Fig. 7, where we can see that the fraction of missed transitions increases in both models as the degree of stochasticity is increased.

Once $A_Q$ is fixed we can check the number of time steps, $\Delta T = |t^* - t_{cross}|$, between the time $t_{cross}$ at which $Q(t)$ goes above $A_Q$ and the transition. In this way we can check the forecasting power of the indicator. In Fig. 8 we present the distribution of $\Delta T$ for $A_Q = 0.01$ and $A_Q = 20$ for the Replicator and the Tangled Nature Model respectively. We can see that in the Replicator Model the crossing times are tens of time steps before the transition time. This means that the system will go through many cycles of updates before the transition occurs. In the Tangled Nature Model, in more than 50% of cases $\Delta T \in [2, 5]$. As explained above when introducing the model, one generation corresponds to the average number of time steps necessary to remove everyone from the system, i.e. $N(t)/p_{kill}$ individual updates. So even low values of $\Delta T$ can be considered to correspond to a strong forecasting power.


INCOMPLETE KNOWLEDGE 


An obvious shortcoming concerning the application to real situations of the forecasting procedure as described so far is that we make use of complete knowledge of the entire space of agents and their interactions (both the actually realized and the "in potentia" part). In this section we first consider how the lack of full knowledge of the interaction strength between agents influences our ability to detect approaching transitions. We next consider a much simpler measure than the overlap function $Q(t)$. This new measure is inspired by the analysis presented above and leading to $Q(t)$, but avoids the need for information about the adjacent possible, i.e. information about agents that are not extant in the system at the time of forecasting. Our new measure only makes use of the time evolution of directly observable quantities and can therefore in principle be applied without the need of a dynamical model of the considered system.

We investigate the effect of a lack of complete information 
concerning the interactions between agents by introducing 
an error in the interaction matrix used for the 
mean field treatment. We do this in the following way: 

$$\tilde{J}_{ij} = J_{ij} + \chi \qquad (21)$$ 

where $\chi$ is $N(0, \sigma)$, i.e. a normally distributed random 
variable of mean 0 and variance $\sigma$. We then repeat the 
exact same procedure outlined in the previous section but 
using $\tilde{J}_{ij}$ in the calculations. 
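
As an illustration of this perturbation, the sketch below adds Gaussian noise to an interaction matrix stored as a list of lists. The function name `perturb_interactions` and its parameters are hypothetical. It assumes that every matrix entry receives an independent draw, and it treats $\sigma$ as a variance, as in the text, so the standard deviation passed to the generator is $\sqrt{\sigma}$; set `sigma_is_variance=False` if $\sigma$ is meant as a standard deviation.

```python
import math
import random


def perturb_interactions(J, sigma, rng=None, sigma_is_variance=True):
    """Return a noisy copy of J with an independent N(0, sigma) error per entry.

    Assumption: each entry gets its own draw; the original matrix is left untouched.
    """
    rng = rng or random.Random()
    # random.gauss expects a standard deviation, not a variance.
    std = math.sqrt(sigma) if sigma_is_variance else sigma
    return [[J_ij + rng.gauss(0.0, std) for J_ij in row] for row in J]


# Toy 2x2 interaction matrix with a fixed seed for reproducibility.
J = [[0.0, 0.5], [-0.3, 0.0]]
J_noisy = perturb_interactions(J, sigma=0.04, rng=random.Random(1))
```

The noisy matrix then replaces the true one in the stability analysis, while the stochastic simulation itself keeps using the unperturbed interactions.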

Figure 7. We present in this figure, for both models, the 
fraction of missed transitions as a function of the noise in 
the system. We can see that for noisier systems it is harder to 
forecast a transition. 



Figure 8. Distribution of the respite of the alarms for a given threshold. The left panel refers to the Replicator Model, for 
which Aq = 0.01 and the right panel to the Tangled Nature Model, for which Aq = 20. 




Figure 9. In this figure we show the fraction of the transitions we are not able to forecast and the fraction of false positives, as a 
function of the $\sigma$ of the distribution of the random error in the interactions. Once again we have used Aq = 30 for the Tangled 
Nature Model (right panel) and Aq = 0.01 for the Replicator Model (left panel). 


In Fig. 9 we present the fraction of transitions we are not able to forecast and the fraction of false alarms we 


Figure 10. Top left and bottom left: respectively, the occupation plot and the total number of individuals N(t) in the 
Tangled Nature Model. The vertical red lines represent the alarm times. In the top and bottom right panels we compare the behavior 
of the occupation plot and the frequencies of the most occupied strategies (blue curves) in the Replicator Model with the alarms 
given by our new procedure. One can clearly see how after every alarm the system changes its configuration. 


generate as a function of the variance $\sigma$, i.e. as a function 
of how much the interaction matrix used for the stability 
analysis differs from the correct set of interactions. For 
the Tangled Nature Model (see the right panel) we notice 
that for $\sigma < 0.2$ we are still able to forecast around 70% 
of the transitions and we generate less than 20% false 
alarms. This is an encouraging result since $\sigma = 0.2$ is 
clearly a significant error given that $J_{ij} \in (-1, 1)$. A very 
similar result holds for the Replicator Model. 
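
The two fractions discussed here can be computed from lists of alarm times and transition times. The sketch below is illustrative only and rests on a matching rule that is an assumption, not taken from the text: an alarm announces a transition if the transition follows it within `max_lead` time steps. The name `forecast_scores` is hypothetical.

```python
def forecast_scores(alarm_times, transition_times, max_lead):
    """Return (missed_fraction, false_alarm_fraction).

    Assumed matching rule: an alarm at time a announces a transition at
    time t if 0 < t - a <= max_lead.
    """
    def announces(alarm, transition):
        return 0 < transition - alarm <= max_lead

    # Transitions with no preceding alarm inside the window are missed.
    missed = [t for t in transition_times
              if not any(announces(a, t) for a in alarm_times)]
    # Alarms followed by no transition inside the window are false alarms.
    false_alarms = [a for a in alarm_times
                    if not any(announces(a, t) for t in transition_times)]

    missed_fraction = len(missed) / len(transition_times) if transition_times else 0.0
    false_fraction = len(false_alarms) / len(alarm_times) if alarm_times else 0.0
    return missed_fraction, false_fraction


# Toy data: the alarm at 300 is false and the transition at 900 is missed.
missed, false = forecast_scores([95, 300, 480], [100, 500, 900], max_lead=50)
```

Repeating this for several values of $\sigma$ yields curves of the kind shown in Fig. 9.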

We now discuss a forecasting procedure that does not 
need any knowledge about "in potentia" agents. We only 
need to focus on the highly occupied nodes present in the 
system. We only use what we observe, without making any 
use of the inactive part of the interaction network or 
of the poorly occupied nodes. 

By applying the LSA to the occupied network we can 
check that, during a stable phase, the configuration corresponds 
to a situation where the spectrum of M 
consists of eigenvalues that all have negative real parts. 
As the system evolves new mutants appear. As an indicator 
of approaching transitions we track the growth 
of the occupancy of these new agents. If their occupancy 
exceeds a certain threshold we check the spectrum of the 
updated M, in which the new agents are included. In 
case the spectrum now includes an eigenvalue with positive 
real part, we take this as an indicator of an approaching 
transition out of the present metastable configuration. 
This will be our new alarm. 
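
The control flow of this alarm can be sketched as follows. This is only an outline under stated assumptions: the occupancies are given as one dictionary per time step, the first time step defines the initial stable configuration, and the spectral test is delegated to a caller-supplied function `is_unstable`, a hypothetical placeholder for building the updated M on the active agents and checking for an eigenvalue with positive real part.

```python
def newcomer_alarms(occupancy_history, threshold, is_unstable):
    """Return the time steps at which the newcomer-based alarm is raised.

    occupancy_history: list of dicts {agent_id: occupancy}, one per time step.
    threshold: occupancy above which an agent counts as part of the network.
    is_unstable: hypothetical callable taking the set of active agents and
        returning True if the updated M has an eigenvalue with positive real part.
    """
    seen = set()  # agents that have already been included in the analysis
    alarms = []
    for t, occupancy in enumerate(occupancy_history):
        active = {a for a, n in occupancy.items() if n > threshold}
        newcomers = active - seen
        seen |= newcomers
        if t == 0:
            continue  # assumption: the initial configuration is taken as stable
        # The spectrum is only re-checked when a new agent crosses the threshold.
        if newcomers and is_unstable(frozenset(active)):
            alarms.append(t)
    return alarms


# Toy run: agent "c" appears, grows past the threshold at t = 2 and destabilizes.
history = [
    {"a": 50, "b": 40},
    {"a": 50, "b": 40, "c": 2},
    {"a": 48, "b": 39, "c": 12},
    {"a": 30, "b": 20, "c": 40},
]
print(newcomer_alarms(history, 10, lambda active: "c" in active))  # prints [2]
```

Checking the spectrum only when a newcomer crosses the threshold keeps the number of eigenvalue computations small, and the procedure needs no information about agents that are not present in the system.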


In Fig. 10 we show the results of an application of this 
new procedure. In both panels the red vertical lines indicate 
the times of appearance of a species able to change 
the stability of the system. We can qualitatively see from 
the figure that just after the alarms the system actually 
undergoes a transition. 

In the left panel of Fig. 10 the blue curves represent 
the frequencies of the most occupied strategies in 
the Replicator Model. We can see how right after the 
red lines, the alarm times, a new strategy starts gaining 
frequency and eventually puts an end to the stable 
configuration. 

In the right panel we show the total number of individuals 
present in the system, $N(t) = \sum_j n_j(t)$. A transition 
to a new metastable configuration is associated with 
a sudden change of this quantity. We notice that after 
each alarm N(t) exhibits a significant change. Preliminary 
analysis indicates that this procedure is able to forecast 
transitions with an accuracy similar to that of the Q(t) 
indicator. Further investigation of the efficiency and 
reliability of using the growth rate of newcomers as an 
indicator of approaching transitions is underway. Obviously 
this can make our procedure more readily applicable to real 
systems, since we would then only need directly observable 
information. 


SUMMARY AND CONCLUSION 

We have described a new procedure for forecasting transitions 
in high dimensional systems with stochastic dynamics. 
Our method is of relevance to systems where the 
macroscopic dynamics at the systemic level is not adequately 
captured by a well defined set of essentially deterministic 
collective variables (e.g. as handled by Langevin 
equations). Hence we are dealing with situations that 
are not captured by the application of bifurcation theory 
such as considered by Scheffer and collaborators 0 - [4]. 
We have in mind complex systems in which the dynamics 
involves some evolutionary aspects, in particular 
situations where the dynamics generates new degrees of 
freedom. Examples are biological evolution and economic and 
financial systems, where new agents (organisms, strategies 
or companies, say) are produced as an intrinsic part 
of the dynamics. We have demonstrated by use of two 
models of varying degree of stochasticity (the Tangled 
Nature Model and the stochastic Replicator Model) that 
a combination of analytic linear stability analysis and 
simulation allows one to construct a signal (overlap with 
unstable directions) which can be used to forecast a very 
high percentage of all transitions. 

The weakness of the procedure is that for real situations 
of interest (e.g. an ecosystem or a financial market) 
one may obviously not possess complete information. One 
will typically not have access to all the information about 
the interactions amongst the agents. This turns out to be 
less of a problem, since we can show that even with a 
10% inaccuracy in interaction strengths, we are still able 
to forecast a substantial percentage of transitions. Another 
shortcoming is that in real situations it can also be 
very difficult to know the nature of the new agents that 
may arrive as the system evolves. Our full mathematical 
procedure suggests a way to overcome this problem. 
Namely, the eigenvector analysis showed that transitions 
are often accompanied by the arrival of new agents, which 
exhibit a rapid growth in their relative systemic weight. 
We found that simply monitoring the rapidly growing 
new agents can enable prediction of major systemic upheavals. 
That is, approaching transitions might not be apparent 
when focusing on the systemic heavyweights; rather, 
one should keep a keen eye on the tiny components to 
monitor whether they suddenly start to flourish. This 
can often be the signal of upcoming systemic changes. 
Our next step will be to test these findings on real data 
streams including high frequency financial time series. 